Rank in Wordlist | Word | Rank in Wordlist | Word |
---|---|---|---|
1 | і | 26 | аж |
2 | ся | 27 | о |
3 | є | 28 | лем |
4 | в | 29 | мають |
5 | на | 30 | же |
6 | з | 31 | жытелїв |
7 | у | 32 | На |
8 | до | 33 | ці |
9 | як | 34 | котры |
10 | суть | 35 | Мать |
11 | або | 36 | міджі |
12 | В | 37 | ёго |
13 | од | 38 | року |
14 | — | 39 | роцї |
15 | по | 40 | із |
16 | не | 41 | птахы |
17 | а | 42 | їх |
18 | мать | 43 | то |
19 | тыж | 44 | – |
20 | але | 45 | км² |
21 | быв | 46 | цм |
22 | про | 47 | Є |
23 | Подїї | 48 | часто |
24 | за | 49 | што |
25 | Народили | 50 | котрый |
The table shows the 50 most frequent words in the corpus. Most entries at these ranks are typically stopwords.
Language: Afrikaans
This list is a good candidate for a first stopword list for a language.
Usually, a small, balanced corpus is enough to produce a good list of high-frequency words. If the small corpus is dominated by a prominent topic, however, that topic will show up even in the top-word list.
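As a rough illustration of how such a top-word list can be derived, the following minimal sketch counts tokens and ranks them by frequency. The function name `top_words` and the toy token list are hypothetical and not part of the corpus tooling. Counting is case-sensitive here, which would explain why the table above lists both "в" and "В" as separate entries.

```python
from collections import Counter


def top_words(tokens, n=50):
    """Return the n most frequent tokens as (rank, word, count) tuples."""
    counts = Counter(tokens)
    # most_common() orders tokens by descending count.
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(n), start=1)
    ]


# Hypothetical toy input, already split on whitespace; not real corpus data.
sample = "і і і і в в в на на ся В".split()

ranking = top_words(sample, n=3)
for rank, word, count in ranking:
    print(rank, word, count)
```

A candidate stopword list would then be the `word` column of this ranking, ideally reviewed by hand to remove topic-specific words that a small or unbalanced corpus pushes into the top ranks.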
select w_id - 100 as rank_in_wordlist, word from words where w_id > 100 order by w_id limit 50;
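To try the query without access to the corpus database, the sketch below runs it against an in-memory SQLite table. The schema (`words` with columns `w_id` and `word`) and the sample rows are assumptions inferred from the query itself. In particular, the filter `w_id>100` and the offset `w_id-100` suggest that the first 100 IDs are reserved and that words are numbered in frequency order starting at 101.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumed schema: only the two columns the query refers to.
conn.execute("CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT)")

# Hypothetical rows: two placeholder entries in the assumed reserved range
# (w_id <= 100), followed by the first three words from the table above.
rows = [(1, "."), (2, ","), (101, "і"), (102, "ся"), (103, "є")]
conn.executemany("INSERT INTO words (w_id, word) VALUES (?, ?)", rows)

# The query from the text, reformatted for readability.
query = """
    SELECT w_id - 100 AS rank_in_wordlist, word
    FROM words
    WHERE w_id > 100
    ORDER BY w_id
    LIMIT 50
"""
result = conn.execute(query).fetchall()
conn.close()

for rank, word in result:
    print(rank, word)
```

Under these assumptions the rank is simply the word ID shifted by 100, so no sorting by frequency count is needed at query time.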
3.4 Sample words for different frequency ranges